Fix metric_value_benchmark integration tests by using latest Kafka version in JMXKafka test #436

Merged
14 commits merged on Dec 2, 2024

@musa-asad (Contributor) commented Dec 2, 2024

Description of the issue

The metric_value_benchmark integration tests produce inconsistent results. The failing test is JMXKafka, where the agent fails to start:

null_resource.integration_test_run (remote-exec): 2024/11/21 18:47:14 ==============================
null_resource.integration_test_run (remote-exec): 2024/11/21 18:47:14 ==============JMXKafka==============
null_resource.integration_test_run (remote-exec): 2024/11/21 18:47:14 ==============Failed==============
null_resource.integration_test_run (remote-exec): Starting Agent   Failed
null_resource.integration_test_run (remote-exec): 2024/11/21 18:47:14 ==============================
null_resource.integration_test_run (remote-exec): 2024/11/21 18:47:14 >>>>>>>>>>>>>>><<<<<<<<<<<<<<<
null_resource.integration_test_run (remote-exec): >>>> Finished MetricBenchmarkTestSuite
null_resource.integration_test_run (remote-exec): --- FAIL: TestMetricValueBenchmarkSuite (1035.28s)
null_resource.integration_test_run (remote-exec):     --- FAIL: TestMetricValueBenchmarkSuite/TestAllInSuite (1035.27s)
null_resource.integration_test_run (remote-exec): FAIL
null_resource.integration_test_run (remote-exec): FAIL	github.com/aws/amazon-cloudwatch-agent-test/test/metric_value_benchmark	1035.290s
null_resource.integration_test_run (remote-exec): FAIL

https://github.com/aws/amazon-cloudwatch-agent/actions/runs/11959277199/job/33340857051

The root cause is that the download URL, https://dlcdn.apache.org/kafka/3.6.2/kafka_2.13-3.6.2.tgz, no longer exists: the 3.6.2 release was removed from dlcdn.apache.org because it is an older version.

Description of changes

  • Instead of downloading from the fixed 3.6.2 URL, fetch the latest version with curl https://dlcdn.apache.org/kafka/ | grep -oP \"\\d\\.\\d\\.\\d\" | tail -1 and use it to pull the tgz file.
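
The version lookup described above can be sketched as a pair of shell functions. This is an illustrative sketch, not the code merged in this PR: the function names latest_kafka_version and kafka_tgz_url are hypothetical, and it assumes GNU sort (for -V) and a directory listing that links each release as href="X.Y.Z/". Unlike the \d\.\d\.\d pattern, which matches only single-digit components and relies on the listing order for tail -1, this variant accepts multi-digit components and sorts the versions explicitly.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not taken from the PR.

# Read an Apache-style directory listing on stdin and print the highest
# MAJOR.MINOR.PATCH version that appears as a release directory link.
latest_kafka_version() {
    grep -oE 'href="[0-9]+\.[0-9]+\.[0-9]+/"' \
        | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' \
        | sort -uV \
        | tail -n 1
}

# Build the tarball URL for a given Kafka version.
# The Scala build (2.13) is copied from the URL quoted in the PR description.
kafka_tgz_url() {
    local version="$1"
    echo "https://dlcdn.apache.org/kafka/${version}/kafka_2.13-${version}.tgz"
}

# Example wiring (needs network access, so it is left commented out):
#   version="$(curl -fsSL https://dlcdn.apache.org/kafka/ | latest_kafka_version)"
#   curl -fsSLO "$(kafka_tgz_url "$version")"
```

Sorting with sort -V keeps the result independent of the order in which the server lists directories. It still assumes that the newest release publishes a kafka_2.13 tarball at that path.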

License

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Tests

Ran Terraform locally:

terraform apply --auto-approve \
    -var="github_test_repo=https://github.com/aws/amazon-cloudwatch-agent-test.git" \
    -var="test_name=ubuntu-20.04" \
    -var="cwa_github_sha=738ec502c2a054e2039afb97c3a48342c87f6ac5" \
    -var="install_agent=go run ./install/install_agent.go deb" \
    -var="github_test_repo_branch=jmx_kafka_fix" \
    -var="ec2_instance_type=t3a.medium" \
    -var="user=ubuntu" \
    -var="ami=cloudwatch-agent-integration-test-ubuntu*" \
    -var="ca_cert_path=/etc/ssl/certs/ca-certificates.crt" \
    -var="arc=amd64" \
    -var="binary_name=amazon-cloudwatch-agent.deb" \
    -var="local_stack_host_name=ec2-34-209-29-92.us-west-2.compute.amazonaws.com" \
    -var="region=us-west-2" \
    -var="s3_bucket=cloudwatch-agent-integration-bucket" \
    -var="plugin_tests=''" \
    -var="excluded_tests=''" \
    -var="test_dir=./test/metric_value_benchmark" \
    -var="agent_start="

Output:

null_resource.integration_test_run (remote-exec): 2024/12/02 20:49:09 ==============JMXKafka==============
null_resource.integration_test_run (remote-exec): 2024/12/02 20:49:09 ==============Successful==============
null_resource.integration_test_run (remote-exec): kafka.unclean.election.rate                  Successful
null_resource.integration_test_run (remote-exec): kafka.request.time.total                     Successful
null_resource.integration_test_run (remote-exec): kafka.request.time.avg                       Successful
null_resource.integration_test_run (remote-exec): kafka.request.time.99p                       Successful
null_resource.integration_test_run (remote-exec): kafka.request.time.50p                       Successful
null_resource.integration_test_run (remote-exec): kafka.request.queue                          Successful
null_resource.integration_test_run (remote-exec): kafka.request.failed                         Successful
null_resource.integration_test_run (remote-exec): kafka.request.count                          Successful
null_resource.integration_test_run (remote-exec): kafka.purgatory.size                         Successful
null_resource.integration_test_run (remote-exec): kafka.partition.under_replicated             Successful
null_resource.integration_test_run (remote-exec): kafka.partition.offline                      Successful
null_resource.integration_test_run (remote-exec): kafka.partition.count                        Successful
null_resource.integration_test_run (remote-exec): kafka.network.io                             Successful
null_resource.integration_test_run (remote-exec): kafka.message.count                          Successful
null_resource.integration_test_run (remote-exec): kafka.max.lag                                Successful
null_resource.integration_test_run (remote-exec): kafka.leader.election.rate                   Successful
null_resource.integration_test_run (remote-exec): kafka.isr.operation.count                    Successful
null_resource.integration_test_run (remote-exec): kafka.controller.active.count                Successful
null_resource.integration_test_run (remote-exec): kafka.consumer.total.records-consumed-rate   Successful
null_resource.integration_test_run (remote-exec): kafka.consumer.total.bytes-consumed-rate     Successful
null_resource.integration_test_run (remote-exec): kafka.consumer.records-consumed-rate         Successful
null_resource.integration_test_run (remote-exec): kafka.consumer.fetch-rate                    Successful
null_resource.integration_test_run (remote-exec): kafka.consumer.bytes-consumed-rate           Successful
null_resource.integration_test_run (remote-exec): kafka.producer.io-wait-time-ns-avg           Successful
null_resource.integration_test_run (remote-exec): kafka.producer.record-retry-rate             Successful
null_resource.integration_test_run (remote-exec): kafka.producer.compression-rate              Successful
null_resource.integration_test_run (remote-exec): kafka.producer.outgoing-byte-rate            Successful
null_resource.integration_test_run (remote-exec): kafka.producer.request-rate                  Successful
null_resource.integration_test_run (remote-exec): kafka.producer.byte-rate                     Successful
null_resource.integration_test_run (remote-exec): kafka.producer.request-latency-avg           Successful
null_resource.integration_test_run (remote-exec): kafka.producer.response-rate                 Successful
null_resource.integration_test_run (remote-exec): kafka.producer.record-error-rate             Successful
null_resource.integration_test_run (remote-exec): kafka.producer.record-send-rate              Successful
null_resource.integration_test_run (remote-exec): 2024/12/02 20:49:09 ==============================
null_resource.integration_test_run (remote-exec): 2024/12/02 20:49:09 >>>>>>>>>>>>>>><<<<<<<<<<<<<<<
null_resource.integration_test_run (remote-exec): >>>> Finished MetricBenchmarkTestSuite
null_resource.integration_test_run (remote-exec): --- PASS: TestMetricValueBenchmarkSuite (314.46s)
null_resource.integration_test_run (remote-exec):     --- PASS: TestMetricValueBenchmarkSuite/TestAllInSuite (314.46s)
null_resource.integration_test_run (remote-exec): PASS
null_resource.integration_test_run (remote-exec): ok  	github.com/aws/amazon-cloudwatch-agent-test/test/metric_value_benchmark	314.478s

@musa-asad requested review from movence and lisguo December 2, 2024 17:05
@musa-asad self-assigned this Dec 2, 2024
@musa-asad changed the title from "Increase duration to 5 minutes for JMX Kafka tests." to "Use latest Kafka version." Dec 2, 2024
@musa-asad changed the title from "Use latest Kafka version." to "Use latest Kafka version" Dec 2, 2024
@musa-asad marked this pull request as ready for review December 2, 2024 20:50
@musa-asad requested a review from a team as a code owner December 2, 2024 20:50
@lisguo changed the title from "Use latest Kafka version" to "Fix metric_value_benchmark integration tests by using latest Kafka version in JMXKafka test" Dec 2, 2024
@musa-asad merged commit b0df293 into main Dec 2, 2024
2 checks passed
@musa-asad deleted the jmx_kafka_fix branch December 2, 2024 21:38
3 participants